Adaptive filter with Riemannian manifold constraint

Adaptive filtering theory has been extensively developed, and most of the proposed algorithms work under the assumption of a Euclidean space. However, in many applications the data to be processed come from a non-linear manifold. In this article, we propose an alternative adaptive filter that works on a manifold, thus generalizing the filtering task to non-Euclidean spaces. To this end, we generalize the least-mean-squares algorithm to operate on a manifold via the exponential map. Our experiments show that the proposed method outperforms other state-of-the-art algorithms in several filtering tasks.

Adaptive filtering has had several successful practical applications and plays an important role in signal processing. It has been applied to system identification, channel equalization, biosignal noise cancellation, and acoustic echo cancellation, among other fields. Adaptive filtering is frequently needed because signals are often contaminated by noise and unwanted artifacts, such as acoustic echo, powerline interference, or the mother's heartbeat when measuring a fetal electrocardiogram. Until recently, most adaptive filtering methods proposed changes to the objective function to optimize the filter coefficients under the assumption of a Euclidean embedding. However, studies have shown that data in signal processing frequently possess latent non-Euclidean structures1, indicating that other spaces for filter design may provide more meaningful geometric representations and better signal-processing algorithms. This research expands the use of adaptive filtering beyond Euclidean domains by extending the least-mean-squares (LMS) algorithm to Riemannian manifolds. To accomplish this, we restrict the filter coefficients to lie on a specific manifold; as a result, the proposed optimization algorithm requires that the optimization steps take place on this structure.
The theory of adaptive filtering has been the subject of significant research, leading to the development of numerous algorithms. This paper seeks to extend the scope of adaptive filtering to non-Euclidean domains. Currently, most algorithms are designed for Euclidean domains and assume Gaussian noise. In practice, other types of noise and interference can degrade the performance of these algorithms. To address this issue, new algorithms have been proposed that modify the cost function by incorporating different schemes. For example, some works have proposed changing the error criterion in the optimization of the filter, such as the information-theoretic criterion of minimum error entropy, which allows better treatment of complex noise distributions2,3. Another concept related to Renyi's entropy4, mixed correntropy, uses a convex linear combination of two Gaussian kernels. In this scheme, the maximum mixture correntropy criterion is used as the cost function; the result is a flexible and robust filter that exhibits good performance in some scenarios. Furthermore, techniques based on compressive sensing (CS) have also been incorporated, giving rise, for example, to zero-attraction algorithms, in which a penalty that promotes sparsity is imposed on the cost function. For example,5 introduced a bias-compensation vector to compensate for the bias resulting from a noisy input; to this end, an l1-norm penalty favoring sparsity is used in the cost function.
The LMS algorithm was designed to work with filter coefficients varying freely in R^n; however, practical scenarios often involve filter coefficients subject to constraints imposed by physical characteristics or other influencing factors. As a result, the LMS algorithm is inadequate for determining the filter coefficients when they are limited to a specific subset of the Euclidean space R^n. Instead, constrained optimization methods are employed6. Some examples of LMS with restrictions arise naturally in applications such as array processing, spectral analysis, and blind multiuser detection, where the filter coefficients are subject to a set of linear equality constraints. To address this, the LMS algorithm with a linear equality constraint was proposed in7,8, achieving better performance, while box constraints were utilized in9,10, where the approach was further extended to bounded norm constraints with l2, l1, and l∞ norms. In11,12 an adaptive filter algorithm incorporating a quadratic equality constraint was introduced. The work of6 considered constraints given by the hypercube in R^n (box constraints) and by the bounded hypersphere, which led to the development of a quadratically constrained algorithm. The work of13 extends the LMS method to incorporate filter coefficient constraints specified by general sets defined by convex functions. In particular,14 demonstrates that their proposed constraint/regularization methods effectively ensure that the filter parameters satisfy the constraints, resulting in superior performance compared to the traditional LMS algorithm. Several of the commonly encountered constraint sets mentioned above can be understood as Riemannian manifolds. For example, the task of minimizing a general function f : R^n → R over the hypercube in R^n can be reformulated as a minimization problem on a manifold, as stated in15. Likewise, optimizing f : R^n → R subject to quadratic inequality constraints corresponds to optimization on the hypersphere, which constitutes a Riemannian manifold. Furthermore, the scenario involving linear constraints can be straightforwardly translated into an optimization problem on the linear manifold defined by the set of linear constraints. Therefore, in several cases, including the aforementioned ones, the LMS algorithm with constraints can be formulated as an optimization method on a manifold, thereby leveraging the advantages provided by such techniques16.
In this research, we explore the practical benefits of utilizing novel non-Euclidean spaces for filter design. We anticipate that the community will further advance this approach theoretically and uncover new applications for it. To our knowledge, the only similar work is that of Bonnabel et al.17, which, in the context of learning problems for classification and clustering, proposes a generalization of LMS to the particular case of the manifold of low-rank positive semidefinite matrices. Their proposed algorithm does not explicitly use the exponential map in the optimization process. In contrast, our research focuses on signal processing and extends the LMS algorithm to the broader setting in which the filter coefficients lie on a geodesically complete Riemannian manifold. To perform optimization on a manifold, we use the exponential map. Our study demonstrates that the algorithm converges in this new context.
The rest of the paper is organized as follows. In the Mathematical Preliminaries section, the main concepts from Riemannian geometry and the stochastic gradient descent method used in this work are introduced. In the Methods section, we review the LMS and normalized least-mean-squares (NLMS) algorithms and their geometric interpretations, and then generalize these results to other spaces; we then propose an algorithm for adaptive filtering on manifolds based on LMS and the exponential map. In the Results section, the performance of the proposed method is compared against other algorithms on various tasks. Finally, the Conclusions section summarizes the results obtained.

Mathematical preliminaries
This section presents the fundamentals of Riemannian geometry underlying the optimization theory over manifolds, together with a brief introduction to the stochastic gradient on manifolds. In this work, the manifolds M under consideration are submanifolds embedded in the Euclidean space R^n for some n. We start by giving some basic definitions, which can be found in several books on Riemannian geometry18-21, among others.

Definition 1 (Tangent space) Given a point x on a manifold M, the tangent space T_xM is the set of velocity vectors at x of smooth curves on M passing through x. That is, v ∈ T_xM if and only if there exists a smooth curve on M passing through x with velocity v. It can be proved that T_xM is a linear space of the same dimension as the manifold M. Next, the concept of the disjoint union of all the tangent spaces of the manifold is formalized.

Definition 2 (Tangent bundle)
The tangent bundle of a manifold M is denoted TM and is defined as

TM = {(x, v) : x ∈ M, v ∈ T_xM}.

On each tangent space T_xM an inner product ⟨·,·⟩_x : T_xM × T_xM → R can be defined, which induces the norm ‖u‖_x = √⟨u, u⟩_x. When the inner product ⟨·,·⟩_x varies smoothly with x, it defines a Riemannian metric.

Definition 3 (Riemannian metric)
Given a smooth manifold M, a Riemannian metric is a correspondence that associates to each point x ∈ M an inner product ⟨·,·⟩_x that varies smoothly with x. In other words, for all smooth vector fields X, Y on M, the function s : M → R defined as s(x) = ⟨X(x), Y(x)⟩_x is smooth. A manifold with a Riemannian metric is called a Riemannian manifold.
In our context, where M is a submanifold embedded in the Euclidean space R^n for some n, the metric ⟨·,·⟩_x is the inner product of R^n.
Definition 4 (Riemannian gradient) Given a smooth function f : M → R, the gradient ∇f is uniquely defined. Moreover, the gradient ∇f can be calculated from a smooth extension f̄ : R^n → R of f as

∇f(x) = Proj_x(∇f̄(x)),

where Proj_x is the orthogonal projection from R^n to T_xM.
Typically, geodesic curves are defined in terms of the covariant derivative; here we give a definition based on the fact that the manifold is embedded in a Euclidean space, see Boumal20.
Definition 5 (Geodesics) On a Riemannian manifold M, a geodesic is a smooth curve c : I → M with zero intrinsic acceleration; for a manifold embedded in R^n, this means that the acceleration c''(t), computed in the ambient space, is orthogonal to the tangent space T_{c(t)}M for all t.

Lemma 1 There exists ε > 0 such that for every p ∈ M and every v ∈ T_pM with ‖v‖_p < ε, there is a unique geodesic c_v : [0, 1] → M with c_v(0) = p and c'_v(0) = v.

This Lemma guarantees the existence of a unique geodesic that passes through p with velocity v satisfying ‖v‖_p < ε. From this, the exponential map Exp : U ⊂ TM → M can be defined as Exp(p, v) = c_v(1). It is common to consider the restriction of Exp to the tangent plane in the following way: given a point q ∈ M, the function Exp_q : B(0, ε) ⊂ T_qM → M, where B(0, ε) is an open ball of radius ε as in Lemma 1, is defined as Exp_q(v) = Exp(q, v). Exp_q(v) can be interpreted as the point of M obtained by walking along the geodesic for one unit of time with velocity v.
The value of ε can be chosen such that Exp_q : B(0, ε) → M is a diffeomorphism onto its image; the supremum of the values of ε for which Exp_q remains a diffeomorphism defines the injectivity radius. A formal definition is presented below.

Definition 6 (Injectivity radius) The injectivity radius of a Riemannian manifold M at x, denoted inj(x), is defined as the supremum of the values ε > 0 such that the exponential map is a diffeomorphism from B(0, ε) ⊂ T_xM onto its image in M.
The injectivity radius of M is defined as the infimum of inj(x) over the manifold, that is,

inj(M) = inf_{x∈M} inj(x).

Observe that in a geodesically complete Riemannian manifold, the exponential maps are defined on the entire tangent plane. Some examples of geodesically complete manifolds are the sphere S^n, the n-dimensional torus T^n, and the hyperbolic space H^{n−1}. In this work, we will restrict ourselves to geodesically complete manifolds.

Now we review the stochastic gradient descent (SGD) algorithm and its Riemannian version. Given a cost function defined as the expected value of a loss function f(z, w) with respect to the variable z,

C(w) = E_z[f(z, w)],   (1)

the gradient of C(w) is expressed as

∇C(w) = E_z[h(z, w)],   (2)

where h(z, w) = ∇_w f(z, w). In the minimization of C(w), the SGD algorithm starts from an initial value w_1 and iteratively takes samples z_k of the variable z; for each sample it calculates a new value of w by the iterative scheme

w_{k+1} = w_k − ρ_k h(z_k, w_k).   (4)

The convergence of the SGD algorithm has been proved by Bottou in22. This method can be generalized to a Riemannian manifold M, where the optimization problem is stated as min_{w∈M} C(w). In this context, the iterative scheme (4) is carried out by means of the exponential map, w_{k+1} = Exp_{w_k}(−ρ_k h(z_k, w_k)), which allows moving over the manifold. The algorithm is summarized below (Algorithm 1). The convergence of SGD on a Riemannian manifold was first proved in23, and subsequently several works, such as Zhang et al.24 and Tripuraneni et al.25, among others, have demonstrated the convergence of modifications of the above algorithm. The next Theorem is due to Bonnabel23 and guarantees convergence under quite general conditions on M and f.
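Before stating the convergence result, the Riemannian SGD iteration just described can be summarized in code. The sketch below is a minimal reading of Algorithm 1, assuming a NumPy setting in which the caller supplies the manifold-specific Riemannian gradient and exponential map; it is not the authors' implementation.

```python
import numpy as np

def riemannian_sgd(w1, samples, step_sizes, riemannian_grad, exp_map):
    """Minimal sketch of SGD on a manifold (cf. Algorithm 1).

    riemannian_grad(z, w): Riemannian gradient h(z, w) of the loss at w
                           (a vector in the tangent space T_w M).
    exp_map(w, v):         exponential map Exp_w(v), returning a point on M.
    Both callables are manifold-specific and assumed to be given.
    """
    w = np.asarray(w1, dtype=float)
    for z, rho in zip(samples, step_sizes):
        h = riemannian_grad(z, w)      # h(z_k, w_k)
        w = exp_map(w, -rho * h)       # w_{k+1} = Exp_{w_k}(-rho_k h(z_k, w_k))
    return w
```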

Theorem 1 (Bonnabel)
Consider the SGD Algorithm 1 on a connected Riemannian manifold M with injectivity radius uniformly bounded from below by I > 0. Assume that the sequence of step sizes {ρ_k}_{k=1}^∞ satisfies the standard condition

Σ_k ρ_k² < ∞ and Σ_k ρ_k = ∞.   (5)
Suppose that: 1. There exists a compact set K ⊂ M such that w_k ∈ K for all k.
2. There exists a constant A > 0 such that for all w ∈ K and all z ∈ Z we have ‖h(z, w)‖ ≤ A, where h(z, w) is the Riemannian gradient of the loss function f(z, w).

Then C(w_k) converges a.s. and ∇C(w_k) → 0 a.s.

Methods
As shown in Fig. 1, the algorithms for adapting a filter to the desired conditions seek to find the filter coefficients w = [w(1), w(2), ..., w(n)]^T that minimize the error e_k between the filter output and a desired signal d_k.
In this work, we propose the implementation of an adaptive algorithm with filter coefficients embedded in a Riemannian manifold. To implement gradient descent, the exponential map is used, as in Sun et al. (2019)26. Next, we present the insights that led us to this viewpoint.
The least-mean-squares algorithm. We start by reviewing the LMS algorithm, which can be seen as an application of SGD to the adaptive filter problem (Fig. 1). The target of the LMS algorithm is to minimize the average error

MSE(w) = E_{d,x}[ ½(d − w^T x)² ]   (6)

with respect to the filter coefficients w, where x is the input signal and d is the desired signal. Moreover, it shall be considered that the signal d is related to x by the equation d = w̄^T x + η for some w̄ ∈ R^n, where η is a zero-mean random variable with variance σ², independent of x. Note that the function (6) has the same form as Eq. (1) with loss function f(d, x, w) = ½(d − w^T x)². Consequently, applying the SGD algorithm produces the update of the filter coefficients w_k at iteration k,

w_{k+1} = w_k + ρ_k e_k x_k,   (7)

where ρ_k > 0 is the step size, x_k is the input or reference signal, and e_k = d_k − w_k^T x_k is the filter error at iteration k. An approach that eliminates the dependency on the "volume" of the signal, known as NLMS, suggests using the update

w_{k+1} = w_k + ρ_k (e_k / ‖x_k‖_2²) x_k   (9)

instead of Eq. (7), where ‖·‖_2 is the Euclidean norm. The NLMS algorithm can be interpreted from a geometric viewpoint as projecting the current estimate w_k onto a hyperplane p_{k+1} to find the next estimate w_{k+1}27 (see Fig. 2a). The hyperplane p_{k+1} is defined as

p_{k+1} = {w ∈ R^n : d_k − w^T x_k = 0},

which corresponds to the plane containing all points w that make the error equal to zero. This process of finding a new plane and projecting onto it is performed at each iteration.
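For reference, a minimal NumPy sketch of one LMS iteration (Eq. 7) and one NLMS iteration (Eq. 9) follows; the small constant `eps` guarding the NLMS denominator is an implementation assumption, not part of the formulas.

```python
import numpy as np

def lms_step(w, x, d, rho):
    """One LMS iteration: w_{k+1} = w_k + rho_k * e_k * x_k (Eq. 7)."""
    e = d - w @ x                        # filter error e_k = d_k - w_k^T x_k
    return w + rho * e * x, e

def nlms_step(w, x, d, rho, eps=1e-8):
    """One NLMS iteration: update normalized by ||x_k||_2^2 (Eq. 9)."""
    e = d - w @ x
    return w + rho * (e / (x @ x + eps)) * x, e
```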
Here, we propose to view the iterative process as a whole: instead of having a sequence of planes, we now have a manifold where each hyperplane is a tangent plane at a point on the manifold. Thus, instead of projecting onto planes, what we need is a geodesic on the manifold in the direction of the error. This is illustrated in Fig. 2b, where a point on a path from p_1 to p_2 (continuous line) can lie outside the manifold, which does not happen for a geodesic curve (dashed curve).
The proposed LMS implementation is similar to the classic LMS algorithm (Eq. 7). However, we assume that the filter coefficients w are constrained to a smooth Riemannian manifold M, which generalizes other types of restrictions imposed in previous works8-10. Therefore, in this context, the LMS algorithm aims to solve the optimization problem

min_{w∈M} MSE(w),

where the Riemannian manifold M is embedded in the Euclidean space R^n for some n and is endowed with the Riemannian metric given by the Euclidean inner product inherited from the ambient space R^n. As in the Euclidean LMS, the signal d satisfies d = w̄^T x + η, where w̄ ∈ M and η is a random variable independent of x with zero mean and variance σ². The filter output is computed as usual, as a convolution of the FIR structure w_k with the input x_k. To minimize MSE(w) (Eq. 6) with w ∈ M, the proposed LMS algorithm starts at an initial point w_1 ∈ M and progressively produces a sequence of filter values {w_k}_{k=1}^∞ ⊂ M in the same fashion as the SGD algorithm on a manifold (Algorithm 1). Given w_k ∈ M and a sample point (d_k, x_k), the method computes the negative Euclidean gradient of f(d_k, x_k, w) = ½(d_k − w^T x_k)² as

(d_k − w_k^T x_k) x_k = e_k x_k,   (12)

and the Riemannian gradient direction as the projection of Eq. (12) onto the tangent plane T_{w_k}M,

v_k = Proj_{w_k}(e_k x_k).   (13)

Then, the next point w_{k+1} is obtained by moving ρ_k > 0 units along the geodesic γ_{w_k}(t) = Exp_{w_k}(tv_k) with velocity v_k, i.e.,

w_{k+1} = Exp_{w_k}(ρ_k v_k).   (14)

This algorithm is summarized below (Algorithm 2).
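As a reading aid for the algorithm box, the following sketch captures the update of Eqs. (12)-(14): compute the error, project the Euclidean gradient direction onto the tangent space, and move along the resulting geodesic. The callables `proj` and `exp_map` stand for the manifold-specific operations given next; this is a sketch under those assumptions, not a definitive implementation.

```python
import numpy as np

def manifold_lms_step(w, x, d, rho, proj, exp_map):
    """One iteration of the proposed LMS on a manifold (cf. Algorithm 2).

    proj(w, z):    projection of z onto the tangent space T_w M (Eq. 13).
    exp_map(w, v): exponential map Exp_w(v) (Eq. 14).
    """
    e = d - w @ x                  # e_k = d_k - w_k^T x_k
    v = proj(w, e * x)             # v_k = Proj_{w_k}(e_k x_k), Eqs. (12)-(13)
    return exp_map(w, rho * v), e  # w_{k+1} = Exp_{w_k}(rho_k v_k), Eq. (14)
```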
We implemented the proposed algorithm for two manifolds: the hypersphere S^{n−1} and the hyperbolic (n−1)-space H^{n−1}, both embedded in R^n. Now, we present the specific expressions of the projection operators and exponential maps.
The hypersphere is defined as S^{n−1} = {w ∈ R^n : ⟨w, w⟩ = 1}, with the Riemannian metric being the Euclidean inner product inherited from R^n. Given w ∈ S^{n−1}, the tangent plane at w is T_wS^{n−1} = {v ∈ R^n : ⟨v, w⟩ = 0}, the orthogonal projection of R^n onto T_wS^{n−1} is

Proj_w(z) = z − ⟨z, w⟩w,

and the exponential map is

Exp_w(v) = cos(‖v‖)w + sin(‖v‖) v/‖v‖,

which can be found in Boumal20.
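Assuming the expressions above, the sphere operations can be written as follows; the guard for a near-zero tangent vector is an implementation detail, not part of the formulas.

```python
import numpy as np

def sphere_proj(w, z):
    """Orthogonal projection onto T_w S^{n-1}: z - <z, w> w."""
    return z - (z @ w) * w

def sphere_exp(w, v):
    """Exponential map on S^{n-1}: cos(||v||) w + sin(||v||) v / ||v||."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:                 # Exp_w(0) = w
        return w
    return np.cos(nv) * w + np.sin(nv) * (v / nv)
```

For instance, `manifold_lms_step(w, x, d, rho, sphere_proj, sphere_exp)` then advances the proposed filter one iteration on the hypersphere.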
The hyperbolic space is defined as H^n = {y = (y_0, y_1, ..., y_n) ∈ R^{n+1} : ⟨y, y⟩_{H^n} = −1, y_0 > 0}, with Riemannian metric ⟨u, v⟩_{H^n} = u^T J v, known as the Minkowski inner product, where J = diag(−1, 1, ..., 1). The projection onto the tangent plane T_wH^n is given by

Proj_w(z) = z + ⟨w, z⟩_{H^n} w.   (17)

To prove the convergence of Algorithm 2, it is tempting to apply Theorem 1; however, it cannot be applied directly, since the condition that the gradient Proj_w(h(x, d, w)) be uniformly bounded for every pair (d, x) is not satisfied. Nevertheless, following the ideas of the proof of Theorem 1 in Bonnabel23, the convergence of the proposed method can be proved under quite general conditions on the random variable x, as shown in the following Lemma.

Lemma 2 Let M be a geodesically complete manifold embedded in R^n whose Riemannian metric is the traditional Euclidean inner product inherited from R^n. Let MSE : M → R be as in Eq. (6). Assume that M̄ = E_x[‖x‖² xx^T] and E_x[‖x‖²] exist and that M̄ is a strictly positive definite matrix. Consider Algorithm 2 applied to MSE(w) with a sequence of step sizes {ρ_k}_{k=1}^∞ satisfying the standard condition (Eq. 5), and assume also that there exists a compact subset K ⊂ M such that w_k ∈ K for all k. Then MSE(w_k) converges a.s. and ∇MSE(w_k) → 0 a.s.

Proof 1 A well-known fact is that, for any v in a linear space E with an inner product, the norm of the orthogonal projection Proj_V(v) onto a linear subspace V ⊂ E is no larger than the norm of v, that is, ‖Proj_V(v)‖ ≤ ‖v‖. Therefore, defining h̃(x, d, w) = Proj_w(h(x, d, w)), we obtain ‖h̃(x, d, w)‖ ≤ ‖h(x, d, w)‖, and consequently

E_{d,x}[‖h̃(x, d, w)‖²] ≤ E_{d,x}[‖h(x, d, w)‖²].

Using that d = w̄^T x + η, where w̄ ∈ M, we get h(x, d, w) = (w^T x − d)x = −((w̄ − w)^T x + η)x. Therefore, by the independence of η and x and the fact that E_η[η] = 0, the following holds:

E_{d,x}[‖h(x, d, w)‖²] = (w̄ − w)^T E_x[‖x‖² xx^T](w̄ − w) + σ² E_x[‖x‖²].

Then

E_{d,x}[‖h(x, d, w)‖²] ≤ λ* ‖w̄ − w‖² + σ² E_x[‖x‖²],

where λ* > 0 is the greatest eigenvalue of E_x[‖x‖² xx^T]. Since w ∈ K and K is a compact set, there exists a positive constant C′ > 0 such that ‖w̄ − w‖² ≤ C′ for all w ∈ K. This entails that

E_{d,x}[‖h̃(x, d, w)‖²] ≤ λ* C′ + σ² E_x[‖x‖²] =: C.   (25)

Since M is geodesically complete, the exponential map Exp_{w_k}(tv) is well-defined for all t ∈ R and v ∈ T_{w_k}M; then, applying the same Taylor-formula argument as in inequality (5) of23,

MSE(w_{k+1}) − MSE(w_k) ≤ −ρ_k ⟨∇MSE(w_k), h̃(x_k, d_k, w_k)⟩ + ρ_k² K̄ ‖h̃(x_k, d_k, w_k)‖²,   (26)

where ∇MSE(w_k) is the Riemannian gradient of MSE at w_k as in Definition 4 and K̄ > 0 is an upper bound on the eigenvalues of the Hessian of the MSE function. Taking the expected value of both sides of Eq. (26) with respect to the sigma-algebra F_k = {x_1, η_1, ..., x_{k−1}, η_{k−1}} and following the same arguments as in the proof of Theorem 1 in23, we get

E[MSE(w_{k+1}) | F_k] − MSE(w_k) ≤ −ρ_k ‖∇MSE(w_k)‖² + ρ_k² K̄ E_{d,x}[‖h̃(x_k, d_k, w_k)‖²].

Applying the bound (Eq. 25) to the last term of the previous inequality we obtain

E[MSE(w_{k+1}) | F_k] − MSE(w_k) ≤ −ρ_k ‖∇MSE(w_k)‖² + ρ_k² C″,

where C″ = K̄C. The rest of the proof follows exactly as in the proof of Theorem 1 in23. Therefore, it is concluded that MSE(w_k) converges a.s. and ∇MSE(w_k) → 0 a.s.
We remark that for a compact geodesically complete Riemannian manifold, the compact set K in Lemma 2 can be taken to be the whole manifold. The hyperbolic space H^n is a geodesically complete manifold embedded in the Euclidean space R^{n+1}, but it does not inherit the inner product of R^{n+1} as its Riemannian metric; therefore, Lemma 2 cannot be applied directly. In Lemma 3, we provide a proof of the convergence of the LMS algorithm when the filter coefficients are constrained to the hyperbolic space H^n. We recall that the hyperbolic space is defined as H^n = {y = (y_0, y_1, ..., y_n) ∈ R^{n+1} : ⟨y, y⟩_{H^n} = −1, y_0 > 0}, where ⟨u, v⟩_{H^n} = u^T J v with J = diag(−1, 1, ..., 1). For any u ∈ R^{n+1}, the following inequality holds:

⟨u, u⟩_{H^n} ≤ ‖u‖²,   (31)

where ‖u‖ is the Euclidean norm. The following Lemma guarantees the convergence of the LMS algorithm on the n-dimensional hyperbolic space.

Lemma 3 Let H^n be the n-dimensional hyperbolic space. Let MSE : H^n → R be as in Eq. (6). Assume that M̄ = E_x[‖x‖² xx^T] and E_x[‖x‖²] exist and that M̄ is a strictly positive definite matrix. Consider Algorithm 2 applied to MSE(w) with a sequence of step sizes {ρ_k}_{k=1}^∞ satisfying the standard condition (5), and assume also that there exists a compact subset K ⊂ H^n such that w_k ∈ K for all k. Then MSE(w_k) converges a.s. and ∇MSE(w_k) → 0 a.s.

Proof 2 Let us start by proving that, given w ∈ H^n, the projection satisfies

⟨Proj_w(z), Proj_w(z)⟩_{H^n} ≤ (1 + ‖w‖²)‖z‖²,

where ‖z‖ is the Euclidean norm.
By Eq. (17) we have z′ = Proj_w(z) = z + ⟨w, z⟩_{H^n} w; then, since ⟨w, w⟩_{H^n} = −1,

⟨z′, z′⟩_{H^n} = ⟨z, z⟩_{H^n} + ⟨w, z⟩²_{H^n}.   (32)

Using the fact that ⟨z, z⟩_{H^n} ≤ ‖z‖² (inequality (Eq. 31)), we obtain

⟨z′, z′⟩_{H^n} ≤ ‖z‖² + ⟨w, z⟩²_{H^n}.

Applying the Cauchy-Schwarz inequality to the last term, ⟨w, z⟩²_{H^n} = ((Jw)^T z)² ≤ ‖w‖²‖z‖², we get

⟨z′, z′⟩_{H^n} ≤ (1 + ‖w‖²)‖z‖².

Therefore, taking z = h(x, d, w) and z′ = Proj_w(h(x, d, w)), it is obtained that

‖Proj_w(h(x, d, w))‖²_{H^n} ≤ (1 + ‖w‖²)‖h(x, d, w)‖².

This inequality yields the bound E_{d,x}[‖h̃(x, d, w)‖²_{H^n}] ≤ C for w ∈ K, since ‖w‖ is bounded on the compact set K and E_{d,x}[‖h(x, d, w)‖²] is bounded as in the proof of Lemma 2. From this upper bound, we can proceed following the steps outlined in Lemma 2 and in Theorem 1 of23.
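For completeness, a sketch of the corresponding hyperboloid-model operations follows. The projection is Eq. (17); the exponential map uses the standard hyperboloid formula Exp_w(v) = cosh(‖v‖_{H^n}) w + sinh(‖v‖_{H^n}) v/‖v‖_{H^n} (see, e.g., Boumal20), which is assumed here since the text does not display it.

```python
import numpy as np

def mink_inner(u, v):
    """Minkowski inner product <u, v>_{H^n} = u^T J v, J = diag(-1, 1, ..., 1)."""
    return -u[0] * v[0] + u[1:] @ v[1:]

def hyperbolic_proj(w, z):
    """Projection onto T_w H^n (Eq. 17): z + <w, z>_{H^n} w."""
    return z + mink_inner(w, z) * w

def hyperbolic_exp(w, v):
    """Standard hyperboloid exponential map (assumed, cf. Boumal20)."""
    nv = np.sqrt(max(mink_inner(v, v), 0.0))  # tangent vectors have <v, v>_H >= 0
    if nv < 1e-12:                            # Exp_w(0) = w
        return w
    return np.cosh(nv) * w + np.sinh(nv) * (v / nv)
```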

Results
In this section, the results of the experiments are presented. The proposed algorithm was implemented using the geomstats library28. The following methods were used for comparison: the LMS29, the kernel LMS (Kernel)30, the bias-compensated zero-attracting normalized least mean square adaptive filter (BCZA)5, and the normalized LMS adaptive filter with a variable regularization factor (NNLMS)31. In all the experiments, a simple grid search was used to set the parameters of the algorithms for each task. In all cases, the assumed manifold had dimension equal to the filter order. The learning curves were averaged over 100 realizations.
As a preliminary experiment, we consider the identification of two systems whose models possess a spherical and a hyperbolic manifold structure, respectively. The systems are represented by finite impulse response (FIR) filters with coefficients drawn from the corresponding manifold; each filter has six coefficients. The signals were generated by convolving the filters with a square wave of random, varying period. The different methods were then evaluated, and their respective learning error curves were plotted.
Results for the spherical filter are as follows: the Kernel method achieved a mean squared error (MSE) of −20.45 dB, the BCZA method attained −63.18 dB, NNLMS obtained −82.07 dB, LMS reached −79.07 dB, and the proposed method achieved −117.57 dB. Figure 3a presents the learning curves, demonstrating that the proposed filter achieves the lowest error of all methods by leveraging the inherent structure of the synthetic example. Figure 3b-f exhibit the performance of each filter across 100 realizations of the data. It is also evident from these figures that the proposed method closely tracks the system output. Additionally, it is worth noting that all the evaluated methods display high variance at the peaks and flat regions.
For the system characterized by a hyperbolic structure, the Kernel method achieved an MSE of −42.41 dB, the BCZA method attained −63.18 dB, NNLMS obtained −86.03 dB, LMS reached −60.22 dB, and the proposed method achieved −89.50 dB. In Fig. 4a, the learning curves for each method are depicted, with the proposed method demonstrating superior performance compared to the others. Figure 4b-f show the system response alongside the response of each method; notably, the proposed method performs best in this case as well.
In the next experiment, the Mackey-Glass chaotic time series was used for prediction. Each sample d_k was predicted using x_k = [d_{k−3}, d_{k−4}, ..., d_{k−n−3}]^T with filter order n = 11. In addition, each d_k was contaminated with zero-mean white Gaussian noise of variance 0.001 before being compared with the filter output w_k^T x_k. For this experiment, the proposed filter assumed filter coefficients on a hyperbolic manifold. Figure 5a shows the learning curves for each method; Fig. 5b is a zoomed-in view of iterations 2500-2580. The proposed method achieved the lowest MSE, −126.39 dB on average, followed by the NNLMS algorithm with −123 dB and the LMS with −60.24 dB. However, it can be seen in Fig. 5b that the two best learning curves were separated by more than 10 dB most of the time. Figure 5c-g illustrate the performance of the different methods in approximating the Mackey-Glass series. The displayed outcomes represent an average of 10 trials, and the confidence interval at each point of the curve is computed as 1.96 standard deviations of the values achieved by the algorithms. Most algorithms display greater variability at the local maxima and minima of the time series, whereas the proposed method exhibits reduced variance and tracks the signal more precisely.
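To make the setup concrete, the delay embedding described above might be built as in the sketch below; the series `s`, the random seeding, and the exact handling of noise in the regressors are assumptions, since the text only specifies that each d_k is contaminated before comparison with the filter output.

```python
import numpy as np

def make_regressors(s, n=11, rng=np.random.default_rng(0)):
    """Build pairs (x_k, d_k) for the Mackey-Glass prediction task:
    x_k holds n delayed samples starting at delay 3, and the target d_k is
    contaminated with zero-mean Gaussian noise of variance 0.001."""
    noisy = s + rng.normal(0.0, np.sqrt(0.001), size=len(s))
    return [(s[k - 3 : k - n - 3 : -1], noisy[k])  # delays 3, 4, ..., n + 2
            for k in range(n + 3, len(s))]
```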
In another experiment, the task was interference cancellation. To this end, simulated fetal electrocardiogram (fECG) data were used; the signals came from a database of simulated maternal ECG (mECG) and fECG signals32 sampled at 250 samples per second. Figure 6f shows the fECG mounted on a direct-current offset for illustration purposes, together with the abdominal ECG signal, which was composed of the mECG and the fECG. The signal d_k consisted of the ECG at the mother's abdomen (mECG + fECG), the reference signal consisted of the mECG, and the fECG was obtained as the error signal. For this experiment, the proposed method assumed that the filter coefficients lie on a hyperbolic manifold, and the filter order for all the algorithms was 21. Figure 6a-e show the recovery of the fECG for each method; the right graph shows a zoomed-in view of the response. It can be seen that the proposed method had the lowest MSE and recovered the fECG with fewer distortions than the other methods.

For the next experiment, the task at hand was system identification. The data used were from an air heater to be installed in a production line, acquired at one sample per second. The input signal was a digital signal activating a relay, while the output signal came from a temperature sensor. The proposed filter assumed a hypersphere manifold, and the order of all filters was three. The resulting learning curves for the different methods are presented in Fig. 7a. Once again, the proposed method exhibited superior results, achieving an MSE of −74.34 dB. In comparison, the Kernel filter attained −36.44 dB, the BCZA method reached −31.01 dB, NNLMS achieved −63.81 dB, and the LMS method achieved −37.72 dB. Furthermore, in Fig. 7b-f, the system and filter responses for 100 realizations are depicted. It is evident that the proposed method outperforms the others, closely following the signal.
Finally, the last experiment assesses the method's sensitivity to the learning rate ρ. To accomplish this, a low-pass FIR filter consisting of two coefficients was employed as the system, with an input analogous to that of the preceding experiment. The parameter ρ was set to several values: 0.5, 0.1, and 0.05. As seen in Fig. 8, the duration of the output's stabilization transient increases as the parameter decreases, and the error also diminishes as ρ decreases.

Conclusions
In this work, an adaptive filter algorithm is proposed. Instead of assuming a Euclidean embedding, we suppose that the best filter coefficients are embedded in a manifold, and we modify the well-known LMS algorithm accordingly, considering a manifold with a known structure. We demonstrated the effectiveness of the proposed method on interference cancellation, prediction, and system identification tasks, where it outperformed all the other evaluated methods. Future work should include selecting the right manifold type for the task and using a variable regularizer parameter.

Data availability
The data presented in this study are available from the corresponding author upon request.